STAT-SAK
The Statistician's
Swiss Army Knife
Version 2.1
(c) 1985, 1986
"One of many STATOOLS(tm)..."
by
Gerard E. Dallal
53 Beltran Street
Malden, MA 02148
STAT-SAK is the STATistician's Swiss Army Knife: While not
the custom tool for any particular job, it carries out a wide
variety of miscellaneous tasks that are not easily performed
by most large statistical packages.
NOTICE
Documentation and original code copyright 1986 by Gerard E.
Dallal. Reproduction of material for non-commercial purposes
is permitted, without charge, provided that suitable
reference is made to STAT-SAK and its author.
Neither STAT-SAK nor its documentation should be modified in
any way without permission from the author, except for those
changes that are essential to move STAT-SAK to another
computer.
DISCLAIMER
STATOOLS(tm) are provided "as is" without warranty of any
kind. The entire risk as to the quality, performance, and
fitness for intended purpose is with you. You assume
responsibility for the selection of the program and for the
use of results obtained from that program.
INSTALLATION
STAT-SAK was written for the IBM-PC but few changes should be
needed to install it on another computer. The first DATA
statement initializes the variables
IIN -- input unit number (screen)
IOUT -- output unit number (screen)
DESCRIPTION
STAT-SAK is a tool for anyone who regularly analyzes data.
It was written under the assumption that users would have
access to a "comprehensive" micro or mainframe statistical
package such as SAS, SPSS-X, BMDP, or SYSTAT. STAT-SAK does
not perform calculations that require access to the original
observations.
STAT-SAK performs the following calculations and analyses:
1. Distributions:
   a. Normal distribution
        quantile to probability: upper-tail, lower-tail, two-tailed
        probability to quantile
        [Whenever a quantile of zero is specified, STAT-SAK
        prompts for three quantities A,B,C and then evaluates
        A/(B/SQRT(C)).]
   b. t distribution:
        quantile to probability
        probability to quantile
   c. chi-square distribution:
        quantile to probability
        probability to quantile
   d. F distribution:
        quantile to probability
        probability to quantile
   e. Binomial: Prob (binomial(n,p) <=,=,>= k)
   f. Poisson: Prob (Poisson(mean) <=,=,>= quantile)
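The normal-distribution entries in item 1a can be illustrated with a
short Python sketch. It is not STAT-SAK's code: STAT-SAK uses Algorithm
AS 66 for the normal integral, while this sketch assumes the standard
library's erfc function is an acceptable stand-in. The function names
are hypothetical.

```python
import math

def normal_tail_probs(z):
    """Return (lower, upper, two_tailed) probabilities for a standard
    normal quantile z, using erfc in place of Algorithm AS 66."""
    upper = 0.5 * math.erfc(z / math.sqrt(2.0))
    lower = 1.0 - upper
    two_tailed = math.erfc(abs(z) / math.sqrt(2.0))
    return lower, upper, two_tailed

def z_from_abc(a, b, c):
    """Evaluate A/(B/SQRT(C)), e.g. a mean difference A divided by
    the standard error formed from an SD B and a sample size C."""
    return a / (b / math.sqrt(c))

# Illustrative values only: A = 2, B = 5, C = 25 gives z = 2.
z = z_from_abc(2.0, 5.0, 25.0)
lower, upper, two_tailed = normal_tail_probs(z)
```

With these inputs the two-tailed probability is about 0.0455.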
2. Tests of independence/homogeneity of proportions in two
dimensional contingency tables: Pearson chi-square
statistic, with Yates's continuity correction in the
case of a 2 by 2 table.
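As an illustration of item 2 for the 2 by 2 case, the following Python
sketch computes the Pearson chi-square statistic with Yates's
continuity correction. The formula is the usual textbook one and is
assumed, not confirmed, to match what STAT-SAK computes; the function
name and the example counts are hypothetical.

```python
import math

def yates_chi_square_2x2(a, b, c, d):
    """Yates-corrected Pearson chi-square for the table
           a  b
           c  d
    Returns (statistic, P-value on 1 degree of freedom)."""
    n = a + b + c + d
    # The correction subtracts n/2 from |ad - bc|, never going below 0.
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    stat = n * corrected ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of chi-square with 1 df equals erfc(sqrt(x/2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Made-up counts for illustration.
stat, p_value = yates_chi_square_2x2(10, 20, 20, 10)
```

For these counts the statistic works out to 5.4 on 1 degree of freedom.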
3. Fisher's exact test for 2 by 2 contingency tables.
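A minimal Python sketch of Fisher's exact test for item 3 follows. It
enumerates the hypergeometric distribution of the upper-left cell with
all margins fixed. The two-sided rule used here (sum the probabilities
of all tables no more likely than the observed one) is an assumption;
the documentation does not say which rule STAT-SAK applies. The
function name is hypothetical.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Return (upper one-sided P, two-sided P) for the table
           a  b
           c  d
    with all four margins held fixed."""
    r1, r2, c1 = a + b, c + d, a + c
    total = comb(r1 + r2, c1)

    def prob(x):
        # Hypergeometric probability that the upper-left cell equals x.
        return comb(r1, x) * comb(r2, c1 - x) / total

    lo, hi = max(0, c1 - r2), min(r1, c1)
    p_obs = prob(a)
    upper = sum(prob(x) for x in range(a, hi + 1))
    # Small relative tolerance guards against floating-point ties.
    two_sided = sum(prob(x) for x in range(lo, hi + 1)
                    if prob(x) <= p_obs * (1.0 + 1e-7))
    return upper, two_sided

# Made-up table for illustration.
upper, two_sided = fisher_exact_2x2(3, 1, 1, 3)
```

For this table the one-sided P-value is 17/70 and the two-sided value
under the stated rule is 34/70.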
STAT-SAK G.E. Dallal
4. Mantel-Haenszel test, approximate 95% confidence
interval for common odds ratio using equation 10.22 of
Fleiss (1981).
5. McNemar's test: An exact test using the binomial(n,0.5)
distribution.
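The exact McNemar test of item 5 can be sketched in Python as below.
Only the two discordant counts enter the calculation; under the null
hypothesis the smaller one follows a binomial(n,0.5) distribution,
where n is their sum. Doubling the smaller tail to get a two-sided
P-value is an assumption of this sketch, and the function name is
hypothetical.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar P-value from the discordant counts b
    and c, using the binomial(b + c, 0.5) distribution."""
    n = b + c
    k = min(b, c)
    # Probability of k or fewer successes in n fair trials.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2.0 ** n
    return min(1.0, 2.0 * tail)

# Made-up discordant counts for illustration.
p_value = mcnemar_exact(2, 8)
```

With discordant counts 2 and 8 the tail is 56/1024, so the two-sided
P-value is 0.109375.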
6. Correlation coefficients:
a. Test that a population correlation coefficient is
zero.
b. Construct confidence interval for a single
correlation coefficient using a FORTRAN
translation of the BASIC program of Maindonald
(1984, p.300) based on an approximation from
Winterbottom (1980).
c. Compare two independent correlation coefficients
using Fisher's z transformation.
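Item 6c can be illustrated with the following Python sketch, which
compares two independent correlation coefficients through Fisher's z
transformation. It uses the usual large-sample standard error
SQRT(1/(n1-3) + 1/(n2-3)) and a two-tailed normal P-value; both are
textbook choices assumed here, and the function name is hypothetical.

```python
import math

def compare_correlations(r1, n1, r2, n2):
    """Return (z statistic, two-tailed P-value) for the difference
    between two independent sample correlations."""
    z1, z2 = math.atanh(r1), math.atanh(r2)   # Fisher's z transform
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Made-up correlations and sample sizes for illustration.
z, p_value = compare_correlations(0.5, 103, 0.3, 103)
```

For these inputs z is roughly 1.70.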
7. Bartholomew's test for increasing proportions: Exact
P-values are given for 3 or 4 proportions. For 5 or
more proportions, STAT-SAK gives the P-value
appropriate for equal column totals. This value is
NOT conservative for all tables. See Bartholomew
(1959a,b).
Since Bartholomew's test is a pool-adjacent-violators
procedure, it can yield "significant" results even
when the data show no real trend. In the table
85 15 85
15 85 15
the first row proportion decreases from column 1 to
column 2; Bartholomew's test fits the proportion
100 (=85 + 15) / 200 (=100 + 100) to these columns.
The first row proportion then increases significantly
as we move to column 3 and the test statistic is
significant overall. In order to alert users to such
potential pitfalls, STAT-SAK reports the Pearson
goodness-of-fit statistic for homogeneity of
proportions along with the difference between the
Pearson statistic and the Bartholomew statistic.
Although the reference distribution of the difference
is not known, large values are indicative of model
failure. A conservative test can be had using the
central chi-square distribution with degrees of
freedom equal to the number of columns minus 1.
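The pooling step described in item 7 can be sketched in Python using
the 85 15 85 / 15 85 15 table from the text. The sketch assumes the
usual chi-bar-square form of the statistic (the Pearson formula with
the pooled proportions substituted for the observed ones); the
documentation does not give STAT-SAK's formula, and P-values are not
computed. All names are hypothetical.

```python
def pava_increasing(values, weights):
    """Weighted pool-adjacent-violators fit of a nondecreasing
    sequence to the given values."""
    blocks = []  # each block: [weighted mean, total weight, columns]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        # Pool the last two blocks while they violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2),
                           w1 + w2, c1 + c2])
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted

first_row = [85, 15, 85]
totals = [100, 100, 100]
props = [s / n for s, n in zip(first_row, totals)]
fitted = pava_increasing(props, totals)       # columns 1 and 2 pooled

pbar = sum(first_row) / sum(totals)
scale = pbar * (1.0 - pbar)
pearson = sum(n * (p - pbar) ** 2 for p, n in zip(props, totals)) / scale
bartholomew = sum(n * (f - pbar) ** 2 for f, n in zip(fitted, totals)) / scale
difference = pearson - bartholomew
```

Under these assumptions the fitted proportions are 0.50, 0.50, 0.85.
The Pearson statistic is about 138.2, the order-restricted statistic
about 34.5, and the difference of about 103.6 is the kind of large
value that signals model failure.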
8. Bartholomew's test for increasing normal means: Exact
P-values are given for 3 or 4 means. For 5 or more,
STAT-SAK gives the P-value based on equal column
totals. This value is NOT conservative in all cases.
The test for increasing means poses the same problem
as does the test for proportions: the data can be
significant relative to the null hypothesis of no
change even if the monotonicity is seriously violated.
Lack-of-fit is assessed by partitioning sums of
squares as in a standard analysis of variance. Exact
tests for lack of fit are unavailable, however, since
the F-ratios constructed in this manner do not follow
central F distributions under their respective null
hypotheses. We must be content with bounds on the
P-value.
Let BSS and WMS be the between group sum of squares
and within group mean square calculated when
performing a standard one-way analysis of variance.
Let OBSS be the between group sum of squares
calculated under order restriction. Let 'k' be the
number of samples and N be the total number of
observations. Since
     ((BSS-OBSS)/(k-1)) / WMS            [1]
        <=  (BSS/(k-1)) / WMS            [2]
it follows that an upper bound to the P-value can be
had by comparing [1] to the percentiles of the F
distribution with k-1 numerator degrees of freedom and
N-k denominator degrees of freedom.
A lower bound to the P-value is obtained by assigning
the difference BSS-OBSS to a single degree of freedom,
but it must be noted that in the case of two groups
with equal means, BSS-OBSS equals BSS with probability
1/2 and 0 with probability 1/2. Hence, for k
populations not all of whose means are monotonically
nondecreasing, the probability that (BSS-OBSS)/WMS
exceeds some particular value is no less than HALF
that given by the F distribution with 1 numerator
degree of freedom and N-k denominator degrees of
freedom.
The summary statistics may be read from an external
file, one record per sample, each record containing a
sample's mean, SD or SE, and count separated by one or
more spaces. (FORTRAN's list directed input is used to
read the values.)
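The quantities BSS, OBSS and WMS of item 8 can be sketched in Python
from summary records of the form "mean SD count". The sketch assumes
the second field is an SD (not an SE), that fields are separated by
spaces only, and that OBSS is the between group sum of squares of the
pool-adjacent-violators means about the grand mean. It returns the
ratio [1] and the single degree of freedom ratio (BSS-OBSS)/WMS but no
P-values, since those need the F distribution. All names are
hypothetical.

```python
def pava_increasing(values, weights):
    """Weighted pool-adjacent-violators fit (nondecreasing)."""
    blocks = []  # each block: [weighted mean, total weight, groups]
    for v, w in zip(values, weights):
        blocks.append([v, w, 1])
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            blocks.append([(m1 * w1 + m2 * w2) / (w1 + w2),
                           w1 + w2, c1 + c2])
    return [m for m, _, c in blocks for _ in range(c)]

def order_restricted_ratios(records):
    """records: one 'mean SD count' string per sample.
    Returns (BSS, OBSS, WMS, ratio [1], single-df ratio)."""
    rows = [[float(x) for x in line.split()] for line in records]
    means = [r[0] for r in rows]
    sds = [r[1] for r in rows]
    counts = [r[2] for r in rows]
    k, n_total = len(rows), sum(counts)
    grand = sum(m * n for m, n in zip(means, counts)) / n_total
    fitted = pava_increasing(means, counts)
    bss = sum(n * (m - grand) ** 2 for m, n in zip(means, counts))
    obss = sum(n * (f - grand) ** 2 for f, n in zip(fitted, counts))
    wms = sum((n - 1) * s ** 2 for s, n in zip(sds, counts)) / (n_total - k)
    ratio_1 = ((bss - obss) / (k - 1)) / wms
    single_df = (bss - obss) / wms
    return bss, obss, wms, ratio_1, single_df

# Made-up summary statistics for three samples.
bss, obss, wms, ratio_1, single_df = order_restricted_ratios(
    ["10.0 2.0 10", "8.0 2.0 10", "12.0 2.0 10"])
```

For these records BSS is 80, OBSS is 60 and WMS is 4, so ratio [1] is
2.5 on 2 and 27 degrees of freedom and the single degree of freedom
ratio is 5.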
9. One sample t test from summary statistics.
10. Two sample t test from summary statistics: includes
pooled standard deviation, F-ratio for testing
equality of variances, and t tests based on both equal
and unequal (using Satterthwaite's approximation)
variances. (By entering dummy sample means, this
routine can be used to obtain pooled standard
deviations from individual standard deviations or
standard errors.)
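The two sample calculations of item 10 can be sketched in Python from
summary statistics alone. The formulas are the standard ones for the
pooled standard deviation, the equal-variance t statistic, and the
unequal-variance t statistic with Satterthwaite's approximate degrees
of freedom. Taking the F-ratio as larger variance over smaller is an
assumption of this sketch, P-values are omitted, and the function name
is hypothetical.

```python
import math

def two_sample_from_summary(m1, s1, n1, m2, s2, n2):
    """Return (pooled SD, F-ratio, equal-variance t,
    unequal-variance t, Satterthwaite df) from means, SDs, counts."""
    v1, v2 = s1 ** 2, s2 ** 2
    pooled_sd = math.sqrt(((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2))
    f_ratio = max(v1, v2) / min(v1, v2)
    t_equal = (m1 - m2) / (pooled_sd * math.sqrt(1.0 / n1 + 1.0 / n2))
    # Squared standard errors of the two sample means.
    e1, e2 = v1 / n1, v2 / n2
    t_unequal = (m1 - m2) / math.sqrt(e1 + e2)
    df = (e1 + e2) ** 2 / (e1 ** 2 / (n1 - 1) + e2 ** 2 / (n2 - 1))
    return pooled_sd, f_ratio, t_equal, t_unequal, df

# Made-up summary statistics for illustration.
pooled_sd, f_ratio, t_equal, t_unequal, df = two_sample_from_summary(
    10.0, 2.0, 10, 8.0, 4.0, 10)
```

Because the pooled SD does not depend on the means, any dummy means can
be supplied when only the pooled SD is wanted. Here it is SQRT(10), and
the Satterthwaite degrees of freedom come to about 13.2 instead of the
18 used by the equal-variance test.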
STAT-SAK is a dynamic program. If there are any capabilities
you would like to have added, drop me a note. I'll consider
your suggestions for future versions. The criteria for
inclusion are:
1. unavailable in standard statistical packages or not
easily obtained (meaning it's almost easier to do
by hand)
2. does not require the original observations.
FOR THE FREQUENT USER
At the prompts "Enter 'Q' to quit, press <Enter> to continue"
and "Enter 'R' to return to main menu, press <Enter> to
continue", any valid STAT-SAK command may be entered, thereby
bypassing the display of the menu.
ALGORITHMS
STAT-SAK makes use of the following published routines:
Best, D.J. and D.E. Roberts (1975). Algorithm AS 91. The
percentage points of the chi-squared distribution. Appl.
Statist.,24,385-388.
Bhattacharjee, G.P. (1970). Algorithm AS 32. The incomplete
gamma integral. Appl. Statist.,19,285-287.
Cran, G.W., K.J. Martin and G.E. Thomas (1977). Remark
AS R19 and Algorithm AS 109. A remark on algorithms AS
63: The incomplete beta integral, and AS 64: Inverse of
the incomplete beta function ratio. Appl.
Statist.,26,111-114.
Hill, I.D. (1973). Algorithm AS 66. The normal integral.
Appl. Statist.,22,424-427.
Majumder, K.L. and G.P. Bhattacharjee (1973). Algorithm
AS 63. The incomplete beta integral. Appl.
Statist.,22,409-411.
Odeh, R.E. and J.O. Evans (1974). Algorithm AS 70. The
percentage points of the normal distribution. Appl.
Statist.,23,96-97.
and the author's FORTRAN translation of
Pike, M.C. and I.D. Hill (1966). Algorithm 291. Logarithm
of the gamma function. Commun. Ass. Comput. Mach.,9,684.
REFERENCES
Bartholomew, D.J. (1959a). A test of homogeneity for ordered
alternatives. Biometrika,46,36-48.
---------------- (1959b). A test of homogeneity for ordered
alternatives. II. Biometrika,46,328-335.
Fleiss, Joseph L. (1981). Statistical Methods for Rates and
Proportions, 2nd ed. New York: John Wiley & Sons, Inc.
Maindonald, J.H. (1984). Statistical Computation. New York:
John Wiley & Sons, Inc.
Winterbottom, Alan (1980). Estimation of the bivariate
normal correlation coefficient using asymptotic
expansions. Comm. in Statist. Simulation and Computation,
B9, 599-609.
STATOOLS(tm)
STAT-SAK is one of many STATOOLS(tm), a set of stand-alone
programs designed to fill in some of the gaps left by major
statistical program packages such as SAS, SPSS-X, BMDP, and
SYSTAT:
PC-PITMAN calculates observed significance levels using
recursive relationships to obtain the randomization
(permutation) distribution of a number of statistics without
directly examining all possible permutations of the data. It
performs one and two sample randomization tests and rank
tests in the presence of an arbitrary number of ties in the
data.
PC-SIZE determines the sample size requirements for single
factor experiments, two factor experiments, randomized blocks
designs, paired t-tests, and comparison of proportions. It
can calculate the power of specific sample sizes as well as
determine the sample size needed to achieve specific power.
PC-MULTI performs multiple comparisons using Tukey's honest
significant differences (studentized range statistic).
FORGET-IT produces Forget-it Plots (also known as Two-way
Plots), a graphical device for displaying the interaction
structure of a two-way table.
PC-EMS produces tables of expected mean squares for balanced
experiments using the Cornfield-Tukey algorithm.
PC-PLAN generates randomization plans.
STRUCTR fits structural relations when the ratio of error
variances is known.
PC-AIP fits additive-in-the-probits models to two-dimensional
contingency tables with ordered column classifications.
To obtain a set on three diskettes (for the IBM PC and
compatibles running DOS 2.0 or later versions) containing the
source code, executable files, and user's guides, send a
check in the amount of $20 to
Gerard E. Dallal
53 Beltran Street
Malden, MA 02148